86 research outputs found

    Applying test case prioritization to software microbenchmarks

    Get PDF
    Regression testing comprises techniques which are applied during software evolution to uncover faults effectively and efficiently. While regression testing is widely studied for functional tests, performance regression testing, e.g., with software microbenchmarks, is hardly investigated. Applying test case prioritization (TCP), a regression testing technique, to software microbenchmarks may help capture large performance regressions sooner upon new versions. This may be especially beneficial for microbenchmark suites, because they take considerably longer to execute than unit test suites. However, it is unclear whether traditional unit testing TCP techniques work equally well for software microbenchmarks. In this paper, we empirically study coverage-based TCP techniques, employing total and additional greedy strategies, applied to software microbenchmarks along multiple parameterization dimensions, leading to 54 unique technique instantiations. We find that TCP techniques have a mean APFD-P (average percentage of fault-detection on performance) effectiveness between 0.54 and 0.71 and are able to capture the three largest performance changes after executing 29% to 66% of the whole microbenchmark suite. Our efficiency analysis reveals that the runtime overhead of TCP varies considerably depending on the exact parameterization. The most effective technique has an overhead of 11% of the total microbenchmark suite execution time, making TCP a viable option for performance regression testing. The results demonstrate that the total strategy is superior to the additional strategy. Finally, dynamic-coverage techniques should be favored over static-coverage techniques due to their acceptable analysis overhead; however, in settings where the time for prioritization is limited, static-coverage techniques provide an attractive alternative.
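
    To make the total and additional greedy strategies concrete, here is a minimal sketch in Python; the benchmark names, covered-method sets, and coverage granularity are hypothetical, and the paper's techniques additionally vary along further parameterization dimensions.

        # Minimal sketch of the two greedy strategies used in coverage-based TCP.
        # The coverage data below is hypothetical; in the study it comes from
        # dynamic or static coverage of software microbenchmarks.

        def total_greedy(coverage):
            """Order benchmarks by total number of covered elements (descending)."""
            return sorted(coverage, key=lambda b: len(coverage[b]), reverse=True)

        def additional_greedy(coverage):
            """Repeatedly pick the benchmark covering the most not-yet-covered elements."""
            remaining = dict(coverage)
            covered, order = set(), []
            while remaining:
                best = max(remaining, key=lambda b: len(remaining[b] - covered))
                order.append(best)
                covered |= remaining.pop(best)
            return order

        coverage = {
            "benchParse":  {"Parser.parse", "Lexer.next", "Token.of"},
            "benchFormat": {"Formatter.fmt", "Token.of"},
            "benchLex":    {"Lexer.next"},
        }
        print(total_greedy(coverage))       # ['benchParse', 'benchFormat', 'benchLex']
        print(additional_greedy(coverage))  # ['benchParse', 'benchFormat', 'benchLex']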

    On-the-Fly Syntax Highlighting using Neural Networks

    Full text link
    With the presence of online collaborative tools for software developers, source code is shared and consulted frequently, from code viewers to merge requests and code snippets. Typically, code highlighting quality in such scenarios is sacrificed in favor of system responsiveness. In these on-the-fly settings, performing a formal grammatical analysis of the source code is not only expensive, but also intractable in the many cases where the input is an invalid derivation of the language. Indeed, current popular highlighters rely heavily on a system of regular expressions, typically far from the specification of the language's lexer. Due to their complexity, regular expressions need to be updated periodically as more user feedback is collected, and their design hinders the detection of more complex language constructs. This paper delivers a deep learning-based approach suitable for on-the-fly grammatical code highlighting of both correct and incorrect language derivations, as found in code viewers and snippets. It alleviates the burden on developers, who can reuse the language's parsing strategy to produce the desired highlighting specification. Moreover, the approach is compared to current online syntax highlighting tools and formal methods in terms of accuracy and execution time, across different levels of grammatical coverage, for three mainstream programming languages. The results show that the proposed approach consistently achieves near-perfect accuracy in its predictions, thereby outperforming regular expression-based strategies. (Accepted for publication at the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2022.)
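
    As an illustration only, the following hedged sketch frames on-the-fly highlighting as per-token sequence labelling in PyTorch; the architecture, vocabulary size, and number of highlighting classes are assumptions rather than the model evaluated in the paper.

        # Hypothetical sketch: per-token highlighting as sequence labelling.
        import torch
        import torch.nn as nn

        class TokenHighlighter(nn.Module):
            def __init__(self, vocab_size, num_classes, emb_dim=64, hidden=128):
                super().__init__()
                self.emb = nn.Embedding(vocab_size, emb_dim)
                self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
                self.out = nn.Linear(2 * hidden, num_classes)

            def forward(self, token_ids):             # (batch, seq_len)
                h, _ = self.rnn(self.emb(token_ids))  # (batch, seq_len, 2*hidden)
                return self.out(h)                    # per-token class logits

        model = TokenHighlighter(vocab_size=5000, num_classes=8)
        logits = model(torch.randint(0, 5000, (1, 12)))
        highlight_classes = logits.argmax(dim=-1)     # one highlighting class per token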

    Automated Reporting of Anti-Patterns and Decay in Continuous Integration

    Full text link
    Continuous Integration (CI) is a widely used software engineering practice. The software is continuously built so that changes can be easily integrated and issues such as unmet quality goals or style inconsistencies are detected early. Unfortunately, it is not only hard to introduce CI into an existing project, but it is also challenging to live up to the CI principles when facing tough deadlines or business decisions. Previous work has identified common anti-patterns that reduce the promised benefits of CI. Typically, these anti-patterns slowly creep into a project over time before they are identified. We argue that automated detection can help with early identification and prevent such process decay. In this work, we further analyze this assumption and survey 124 developers about CI anti-patterns. From the results, we build CI-Odor, a reporting tool for CI processes that detects the existence of four relevant anti-patterns by analyzing regular build logs and repository information. In a study of 18,474 build logs from 36 popular Java projects, we reveal 3,823 high-severity warnings spread across the projects. We validate our reports in a survey among 13 original developers of these projects and through general feedback from 42 developers, both of which confirm the relevance of our reports.
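
    As an illustration of rule-based detection over build metadata, the sketch below flags one hypothetical anti-pattern (builds left broken for too long); CI-Odor's actual rules, thresholds, and inputs are richer and survey-derived.

        # Illustrative sketch: flag periods where the build stayed red longer
        # than a threshold. Data layout and threshold are assumptions.
        from datetime import datetime, timedelta

        def long_broken_periods(builds, threshold=timedelta(days=2)):
            """builds: list of (timestamp, passed) tuples ordered by time."""
            warnings, broken_since = [], None
            for ts, passed in builds:
                if not passed and broken_since is None:
                    broken_since = ts
                elif passed and broken_since is not None:
                    if ts - broken_since > threshold:
                        warnings.append((broken_since, ts))
                    broken_since = None
            return warnings

        builds = [
            (datetime(2024, 1, 1), True),
            (datetime(2024, 1, 2), False),
            (datetime(2024, 1, 6), True),   # red for 4 days -> warning
        ]
        print(long_broken_periods(builds))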

    A large-scale empirical exploration on refactoring activities in open source software projects

    Get PDF
    Refactoring is a well-established practice that aims at improving the internal structure of a software system without changing its external behavior. Existing literature provides evidence of how and why developers perform refactoring in practice. In this paper, we continue this line of research by performing a large-scale empirical analysis of refactoring practices in 200 open source systems. Specifically, we analyze the change history of these systems at commit level to investigate: (i) whether developers perform refactoring operations and, if so, which are the most diffused; (ii) when refactoring operations are applied; and (iii) which are the main developer-oriented factors leading to refactoring. Based on our results, future research can focus on enabling automatic support for less frequent refactorings and on recommending refactorings based on the developer's workload, the project's maturity, and the developer's commitment to the project.
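
    A minimal sketch of such a commit-level mining loop is shown below; detect_refactorings is a placeholder for a real refactoring detection tool, and the data layout is an assumption.

        # Hypothetical sketch: walk a project's commit history and tally the
        # refactoring types detected between consecutive snapshots.
        from collections import Counter

        def detect_refactorings(old_snapshot, new_snapshot):
            """Placeholder: return refactoring type names found between two snapshots."""
            return []  # e.g. ["Extract Method", "Rename Class"]

        def mine_refactorings(commits):
            """commits: iterable of (old_snapshot, new_snapshot) pairs, one per commit."""
            counts = Counter()
            for old_snapshot, new_snapshot in commits:
                counts.update(detect_refactorings(old_snapshot, new_snapshot))
            return counts  # frequency of each refactoring type across the change history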

    Exploiting natural language structures in software informal documentation

    Get PDF
    Communication means, such as issue trackers, mailing lists, Q&A forums, and app reviews, are premier means of collaboration among developers, and between developers and end-users. Analyzing such sources of information is crucial to build recommenders for developers, for example suggesting experts, re-documenting source code, or transforming user feedback into maintenance and evolution strategies for developers. To ease this analysis, in previous work we proposed DECA (Development Emails Content Analyzer), a tool based on Natural Language Parsing that classifies development email fragments according to their purpose with high precision. However, DECA has to be trained through a manual tagging of relevant patterns, which is often effort-intensive, error-prone, and requires specific expertise in natural language parsing. In this paper, we first show, with a study involving Master's and Ph.D. students, the extent to which producing rules for identifying such patterns requires effort, depending on the nature and complexity of the patterns. Then, we propose an approach, named NEON (Nlp-based softwarE dOcumentation aNalyzer), that automatically mines such rules, minimizing the manual effort. We assess the performance of NEON in the analysis and classification of mobile app reviews, developer discussions, and issues. NEON simplifies pattern identification and rule definition, saving more than 70% of the time otherwise spent performing such activities manually. Results also show that NEON-generated rules are close to the manually identified ones, achieving comparable recall.
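
    NEON mines rules over natural-language parse structures; purely to illustrate the idea of rule-based fragment classification, the sketch below swaps those structures for plain regular expressions, with hypothetical labels and patterns.

        # Illustrative sketch only: NEON's rules operate on parse trees, not on
        # regular expressions. Labels and patterns here are made up.
        import re

        RULES = {
            "feature_request": [
                re.compile(r"\b(please|could you) (add|support)\b", re.I),
                re.compile(r"\bit would be (nice|great|useful) if\b", re.I),
            ],
        }

        def classify_fragment(sentence):
            """Return the intention labels whose rules match the sentence."""
            return [label for label, patterns in RULES.items()
                    if any(p.search(sentence) for p in patterns)]

        print(classify_fragment("It would be great if the app supported dark mode."))
        # ['feature_request']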

    Branch coverage prediction in automated testing

    Get PDF
    Software testing is crucial in continuous integration (CI). Ideally, at every commit, all the test cases should be executed and, moreover, new test cases should be generated for the new source code. This is especially true in a Continuous Test Generation (CTG) environment, where the automatic generation of test cases is integrated into the continuous integration pipeline. In this context, developers want to achieve a certain minimum level of coverage for every software build. However, executing all the test cases and, moreover, generating new ones for all the classes at every commit is not feasible. As a consequence, developers have to select which subset of classes has to be tested and/or targeted by test-case generation. We argue that knowing a priori the branch coverage that can be achieved with test-data generation tools can help developers take informed decisions about these issues. In this paper, we investigate the possibility of using source-code metrics to predict the coverage achieved by test-data generation tools. We use four different categories of source-code features and assess the prediction on a large data set involving more than 3,000 Java classes. We compare different machine learning algorithms and conduct a fine-grained feature analysis aimed at investigating the factors that most impact the prediction accuracy. Moreover, we extend our investigation to four different search budgets. Our evaluation shows that the best model achieves an average MAE of 0.15 and 0.21 on nested cross-validation over the different budgets for EVOSUITE and RANDOOP, respectively. Finally, the discussion of the results demonstrates the relevance of coupling-related features for the prediction accuracy.
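
    A minimal sketch of the prediction setup with scikit-learn is shown below; the feature set and synthetic data are assumptions, and the paper compares several learners, four feature categories, and four search budgets.

        # Sketch: predict achievable branch coverage from source-code metrics.
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        # Hypothetical per-class features, e.g. LOC, branches, coupling, nesting depth.
        X = rng.random((300, 4))
        # Hypothetical achieved coverage in [0, 1] for a fixed search budget.
        y = np.clip(1.0 - 0.6 * X[:, 1] - 0.3 * X[:, 2] + 0.1 * rng.random(300), 0, 1)

        model = RandomForestRegressor(n_estimators=100, random_state=0)
        mae = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error").mean()
        print(f"cross-validated MAE: {mae:.3f}")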

    Telomerecat: A ploidy-agnostic method for estimating telomere length from whole genome sequencing data.

    Get PDF
    Telomere length is a risk factor in disease, and the dynamics of telomere length are crucial to our understanding of cell replication and vitality. The proliferation of whole genome sequencing represents an unprecedented opportunity to glean new insights into telomere biology on a previously unimaginable scale. To this end, a number of approaches for estimating telomere length from whole-genome sequencing data have been proposed. Here we present Telomerecat, a novel approach to the estimation of telomere length. Previous methods have been dependent on the number of telomeres present in a cell being known, which may be problematic when analysing aneuploid cancer data and non-human samples. Telomerecat is designed to be agnostic to the number of telomeres present, making it suited to estimating telomere length in cancer studies. Telomerecat also accounts for interstitial telomeric reads and presents a novel approach to dealing with sequencing errors. We show that Telomerecat performs well at telomere length estimation when compared to leading experimental and computational methods. Furthermore, we show that it detects expected patterns in longitudinal data, repeated measurements, and cross-species comparisons. We also apply the method to cancer cell data, uncovering an interesting relationship with the underlying telomerase genotype.
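
    For intuition only, the following back-of-envelope sketch estimates mean telomere length from the fraction of telomeric reads; it assumes a fixed count of 92 telomere ends, which is precisely the assumption Telomerecat avoids, and it ignores the interstitial-read and sequencing-error corrections the tool provides.

        # Back-of-envelope estimate only, not Telomerecat's model.
        TELOMERIC_REPEAT = "TTAGGG"

        def is_telomeric(read, min_repeats=4):
            """Call a read telomeric if it contains a run of canonical repeats."""
            return TELOMERIC_REPEAT * min_repeats in read

        def naive_telomere_length(reads, genome_size=3_000_000_000, n_telomere_ends=92):
            """Mean length = telomeric read fraction * genome size / telomere ends."""
            telomeric = sum(1 for r in reads if is_telomeric(r))
            return telomeric / len(reads) * genome_size / n_telomere_ends

        # Toy reads with an exaggerated telomeric fraction; real data has millions
        # of reads of which only a tiny fraction is telomeric.
        reads = ["TTAGGG" * 25, "ACGT" * 37 + "GA", "TTAGGG" * 25, "ACGT" * 38]
        print(f"{naive_telomere_length(reads):,.0f} bp (toy data)")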

    Publisher Correction: Telomerecat: A ploidy-agnostic method for estimating telomere length from whole genome sequencing data.

    Get PDF
    A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has been fixed in the paper.

    A Comparison of RDB-to-RDF Mapping Languages

    Full text link
    Mapping Relational Databases (RDB) to RDF is an active field of research. The majority of data on the current Web is stored in RDBs. Therefore, bridging the conceptual gap between the relational model and RDF is needed to make the data available on the Semantic Web. In addition, recent research has shown that Semantic Web technologies are useful beyond the Web, especially if data from different sources has to be exchanged or integrated. Many mapping languages and approaches have been explored, leading to the ongoing standardization effort of the World Wide Web Consortium (W3C) carried out in the RDB2RDF Working Group (WG). The goal and contribution of this paper is to provide a feature-based comparison of the state-of-the-art RDB-to-RDF mapping languages. It should act as a guide in selecting an RDB-to-RDF mapping language for a given application scenario and its requirements w.r.t. mapping features. Our comparison framework is based on use cases and requirements for mapping RDBs to RDF as identified by the RDB2RDF WG. We apply this comparison framework to the state-of-the-art RDB-to-RDF mapping languages and report the findings in this paper. As a result, our classification proposes four categories of mapping languages: direct mapping, read-only general-purpose mapping, read-write general-purpose mapping, and special-purpose mapping. We further provide recommendations for selecting a mapping language.
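
    As a small illustration of the direct-mapping category, the sketch below turns relational rows into RDF triples in N-Triples form; the table, base IRI, and literal handling are simplifying assumptions, and real mapping languages such as R2RML offer far more control.

        # Simplified direct mapping: each row becomes a subject, each column a predicate.
        def direct_mapping(table_name, primary_key, rows, base="http://example.com/"):
            """Yield N-Triples lines for a list of row dictionaries."""
            for row in rows:
                subject = f"<{base}{table_name}/{row[primary_key]}>"
                for column, value in row.items():
                    predicate = f"<{base}{table_name}#{column}>"
                    yield f'{subject} {predicate} "{value}" .'

        employees = [{"id": 7, "name": "Ada", "dept": "R&D"}]
        for triple in direct_mapping("Employee", "id", employees):
            print(triple)
        # <http://example.com/Employee/7> <http://example.com/Employee#id> "7" .
        # ... one triple per column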